Proper versus Ad-Hoc MDL Principle for Polynomial Regression

Authors

  • Aleksandar Pečkov
  • Ljupčo Todorovski
  • Sašo Džeroski
  • Jozef Stefan
Abstract

The paper deals with the task of polynomial regression, i.e., inducing a polynomial that can be used to predict a chosen dependent variable from the values of the independent ones. As in other induction tasks, there is a trade-off between the complexity of the induced polynomial and its predictive error. One approach to finding an optimal trade-off is the Minimum Description Length (MDL) principle. In our previous papers on polynomial regression, we proposed an ad-hoc MDL principle. This paper focuses on developing a proper encoding scheme for polynomials that leads to a proper MDL principle for polynomial regression. We implemented the resulting MDL principle as a search heuristic in CIPER, an algorithm for inducing polynomials from data. We present an empirical comparison between the heuristics based on the ad-hoc and the proper MDL principles. The results show that the proper MDL principle leads to simpler polynomials with comparable predictive error. Finally, we propose a lower bound for the proper MDL principle that allows branch-and-bound pruning of the CIPER search space, and we evaluate the benefits of pruning.
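The trade-off between polynomial complexity and predictive error can be illustrated with a generic two-part description length. The sketch below is not the encoding developed in the paper and is not the CIPER search. It assumes a BIC-style cost of 0.5 log2(n) bits per coefficient plus a Gaussian cost for the residuals, and it only considers univariate polynomials. All function names are hypothetical.

```python
import math


def fit_polynomial(xs, ys, degree):
    """Least-squares fit via the normal equations.

    Returns the coefficients ordered from the constant term upwards.
    """
    k = degree + 1
    # Augmented normal-equation matrix [A^T A | A^T y].
    rows = [
        [sum(x ** (i + j) for x in xs) for j in range(k)]
        + [sum(y * x ** i for x, y in zip(xs, ys))]
        for i in range(k)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(rows[r][col]))
        rows[col], rows[pivot] = rows[pivot], rows[col]
        for r in range(k):
            if r != col:
                factor = rows[r][col] / rows[col][col]
                rows[r] = [a - factor * b for a, b in zip(rows[r], rows[col])]
    return [rows[i][k] / rows[i][i] for i in range(k)]


def residual_sum_of_squares(xs, ys, coeffs):
    """Sum of squared prediction errors of the polynomial on the data."""
    return sum(
        (y - sum(c * x ** p for p, c in enumerate(coeffs))) ** 2
        for x, y in zip(xs, ys)
    )


def description_length(xs, ys, degree):
    """Two-part score in bits: model cost plus data-given-model cost.

    ASSUMPTION: BIC-style model cost and Gaussian residual cost,
    not the proper encoding proposed in the paper.
    """
    n = len(xs)
    coeffs = fit_polynomial(xs, ys, degree)
    rss = residual_sum_of_squares(xs, ys, coeffs)
    model_bits = 0.5 * (degree + 1) * math.log2(n)
    # Clamp the variance so a perfect fit does not produce log2(0).
    data_bits = 0.5 * n * math.log2(max(rss / n, 1e-12))
    return model_bits + data_bits


def select_degree(xs, ys, max_degree):
    """Pick the degree with the smallest total description length."""
    return min(range(max_degree + 1),
               key=lambda d: description_length(xs, ys, d))
```

Calling `select_degree(xs, ys, 5)` scores degrees 0 through 5 and returns the one with the shortest total code. A higher degree is chosen only when the reduction in residual cost outweighs the extra bits spent on its coefficients.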


Similar resources

Layered Representation of Motion Video using Robust Maximum-Likelihood Estimation of Mixture Models and MDL

Representing and modeling the motion and spatial support of multiple objects and surfaces from motion video sequences is an important intermediate step towards dynamic image understanding. One such representation, called layered representation, has recently been proposed. Although a number of algorithms have been developed for computing these representations, there has not been a consolidated e...


Layered Representation of Motion Video Using Robust Maximum-Likelihood Estimation of Mixture Models and MDL Encoding

Representing and modeling the motion and spatial support of multiple objects and surfaces from motion video sequences is an important intermediate step towards dynamic image understanding. One such representation, called layered representation, has recently been proposed. Although a number of algorithms have been developed for computing these representations, there has not been a consolidated e...


Model Selection with the Loss Rank Principle

A key issue in statistics and machine learning is to automatically select the “right” model complexity, e.g., the number of neighbors to be averaged over in k nearest neighbor (kNN) regression or the polynomial degree in regression with polynomials. We suggest a novel principle, the Loss Rank Principle (LoRP), for model selection in regression and classification. It is based on the loss rank, whi...


The Loss Rank Principle for Model Selection

We introduce a new principle for model selection in regression and classification. Many regression models are controlled by some smoothness or flexibility or complexity parameter c, e.g. the number of neighbors to be averaged over in k nearest neighbor (kNN) regression or the polynomial degree in regression with polynomials. Let $\hat{f}_D^c$ be the (best) regressor of complexity c on data D. A more fl...


Inference of Gene Regulatory Networks Based on a Universal Minimum Description Length

The Boolean network paradigm is a simple and effective way to interpret genomic systems, but discovering the structure of these networks remains a difficult task. The minimum description length (MDL) principle has already been used for inferring genetic regulatory networks from time-series expression data and has proven useful for recovering the directed connections in Boolean networks. However...




Publication date: 2006